Search CORE

17 research outputs found

A Direct Translation from XPath to Nondeterministic Automata

Author: David Claire
Francis Nadime
Libkin Leonid
Publication venue
Publication date: 01/01/2011
Field of study

Abstract. Since navigational aspects of XPath correspond to first-order definability, it has been proposed to use the analogy with the very successful technique of translating LTL into automata, and produce efficient translations of XPath queries into automata on unranked trees. These translations can then be used for a variety of reasoning tasks such as XPath consistency, or optimization, under XML schema constraints. In the verification scenarios, translations into both nondeterministic and alternating automata are used. But while a direct translation from XPath into alternating automata is known, only an indirect translation into nondeterministic automata- going via intermediate logics- exists. A direct translation is desirable as most XML specifications have particularly nice translations into nondeterministic automata and it is natural to use such automata to reason about XPath and schemas. The goal of the paper is to produce such a direct translation of XPath into nondeterministic automata.

CiteSeerX

Edinburgh Research Explorer

Run-Based Semantics for RPQs

Author: David Claire
Francis Nadime
Marsault Victor
Publication venue
Publication date: 23/11/2022
Field of study

The formalism of RPQs (regular path queries) is an important building block of most query languages for graph databases. RPQs are generally evaluated under homomorphism semantics; in particular only the endpoints of the matched walks are returned. Practical applications often need the full matched walks to compute aggregate values. In those cases, homomorphism semantics are not suitable since the number of matched walks can be infinite. Hence, graph-database engines adapt the semantics of RPQs, often neglecting theoretical red flags. For instance, the popular query language Cypher uses trail semantics, which ensures the result to be finite at the cost of making computational problems intractable. We propose a new kind of semantics for RPQs, including in particular simple-run and binding-trail semantics, as a candidate to reconcile theoretical considerations with practical aspirations. Both ensure the output to be finite in a way that is compatible with homomorphism semantics: projection on endpoints coincides with homomorphism semantics. Hence, testing the emptiness of result is tractable, and known methods readily apply. Moreover, simple-run and binding-trail semantics support bag semantics, and enumeration of the bag of results is tractableComment: 35 page

arXiv.org e-Print Archive

Schema Mappings for Data Graphs

Author: Francis Nadime
Libkin Leonid
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/05/2017
Field of study

Edinburgh Research Explorer

Asymptotic Determinacy of Path Queries using Union-of-Paths Views

Author: FN Afrati
Nadime Francis
Publication venue
Publication date: 01/01/2015
Field of study

International audienceWe consider the view determinacy problem over graph databases for queries defined as (possibly infinite) unions of path queries. These queries select pairs of nodes in a graph that are connected through a path whose length falls in a given set. A view specification is a set of such queries. We say that a view specification V determines a query Q if, for all databases D, the answers to V on D contain enough information to answer Q. Our main result states that, given a view V, there exists an explicit bound that depends on V such that we can decide the determinacy problem for all queries that ask for a path longer than this bound, and provide first-order rewritings for the queries that are determined. We call this notion asymptotic determinacy. As a corollary, we can also compute the set of almost all path queries that are determined by V

Crossref

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Dagstuhl Research Online Publication Server

Consistency of injective tree patterns

Author: David Claire
Francis Nadime
Murlak Filip
Publication venue
Publication date: 01/01/2014
Field of study

International audienceTesting if an incomplete description of an XML document is consistent, that is, if it describes a real document conforming to the imposed schema, amounts to deciding if a given tree pattern can be matched injectively into a tree accepted by a fixed automaton. This problem can be solved in polynomial time for patterns that use the child relation and the sibling order, but do not use the descendant relation. For general patterns the problem is in NP, but no lower bound has been known so far. We show that the problem is NP-complete already for patterns using only child and descendant relations. The source of hardness turns out to be the interplay between these relations: for patterns using only descendant we give a polynomial algorithm. We also show that the algorithm can be adapted to patterns using descendant and following-sibling, but combining descendant and next-sibling leads to intractability

INRIA a CCSD electronic archive server

HAL Descartes

Edinburgh Research Explorer

Dagstuhl Research Online Publication Server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Datalog Rewritings of Regular Path Queries using Views

Author: Cristina Sirangelo
Luc Segoufin
Maurizio Lenzerini
Nadime Francis
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2014
Field of study

We consider query answering using views on graph databases, i.e. databases structured as edge-labeled graphs. We mainly consider views and queries specified by Regular Path Queries (RPQ). These are queries selecting pairs of nodes in a graph database that are connected via a path whose sequence of edge labels belongs to some regular language. We say that a view V determines a query Q if for all graph databases D, the view image V(D) always contains enough information to answer Q on D. In other words, there is a well defined function from V(D) to Q(D). Our main result shows that when this function is monotone, there exists a rewriting of Q as a Datalog query over the view instance V(D). In particular the rewriting query can be evaluated in time polynomial in the size of V(D). Moreover this implies that it is decidable whether an RPQ query can be rewritten in Datalog using RPQ views

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

A Researcher’s Digest of GQL

Author: Francis Nadime
Gheerbrant Amélie
Guagliardo Paolo
Libkin Leonid
Marsault Victor
Martens Wim
Murlak Filip
Peterfreund Liat
Rogova Alexandra
Vrgoc Domagoj
Publication venue
Publication date: 17/03/2023
Field of study

International audienceGQL (Graph Query Language) is being developed as a new ISO standard for graph query languages to play the same role for graph databases as SQL plays for relational. In parallel, an extension of SQL for querying property graphs, SQL/PGQ, is added to the SQL standard; it shares the graph pattern matching functionality with GQL. Both standards (not yet published) are hard-to-understand specifications of hundreds of pages. The goal of this paper is to present a digest of the language that is easy for the research community to understand, and thus to initiate research on these future standards for querying graphs. The paper concentrates on pattern matching features shared by GQL and SQL/PGQ, as well as querying facilities of GQL

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

HAL-Ecole des Ponts ParisTech

Graph Pattern Matching in GQL and SQL/PGQ

Author: Deutsch Alin
Francis Nadime
Green Alastair
Hare Keith
Li Bei
Libkin Leonid
Lindaaker Tobias
Marsault Victor
Martens Wim
Michels Jan
Murlak Filip
Plantikow Stefan
Selmer Petra
van Rest Oskar
Voigt Hannes
Vrgoc Domagoj
Wu Mingxi
Zemke Fred
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/12/2021
Field of study

As graph databases become widespread, JTC1 -- the committee in joint charge of information technology standards for the International Organization for Standardization (ISO), and International Electrotechnical Commission (IEC) -- has approved a project to create GQL, a standard property graph query language. This complements a project to extend SQL with a new part, SQL/PGQ, which specifies how to define graph views over an SQL tabular schema, and to run read-only queries against them. Both projects have been assigned to the ISO/IEC JTC1 SC32 working group for Database Languages, WG3, which continues to maintain and enhance SQL as a whole. This common responsibility helps enforce a policy that the identical core of both PGQ and GQL is a graph pattern matching sub-language, here termed GPML. The WG3 design process is also analyzed by an academic working group, part of the Linked Data Benchmark Council (LDBC), whose task is to produce a formal semantics of these graph data languages, which complements their standard specifications. This paper, written by members of WG3 and LDBC, presents the key elements of the GPML of SQL/PGQ and GQL in advance of the publication of these new standards

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Edinburgh Research Explorer

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

View-based query determinacy and rewritings over graph databases

Author: Francis Nadime
Publication venue
Publication date: 08/12/2015
Field of study

Les graphes de données sont naturellement utilisés dans de nombreux contextes incluant par exemple les réseaux sociaux ou le Web sémantique. L'information contenue dans la base de données se trouve alors aussi bien dans les données mêmes que dans la topologie du graphe, c'est-à-dire dans la manière dont les données sont connectées. Cela implique donc de considérer les questions traditionnelles en théorie des bases de données pour des langages de requêtes capables de parler des chemins connectant les nœuds du graphe. Nous nous intéressons en particulier aux problèmes de la déterminabilité et de la réécriture d'une requête à l'aide de vues. Il s'agit alors de décider si une vue de la base de données contient suffisamment d'information pour répondre entièrement à une requête sans consulter la base de données directement, et dans ce cas, d'exprimer explicitement la réponse à la requête à partir de la vue. Ce cadre rencontre de nombreuses applications, notamment pour l'intégration de données et l'optimisation de requêtes. Nous commençons par comparer ces deux questions aux autres problèmes de décision classiques dans ce contexte : calcul des réponses certaines, test de cohérence et mise à jour d'une instance de vue. Nous améliorons ensuite ces résultats dans deux cas spécifiques. Tout d'abord, nous montrons que pour les requêtes régulières de chemin, l'existence d'une réécriture monotone coïncide avec l'existence d'une réécriture dans Datalog. Puis, nous montrons que pour des vues s'intéressant uniquement aux longueurs des chemins du graphe, une notion plus faible de déterminabilité, appelée déterminabilité asymptotique, est décidable et résulte en des réécritures du premier ordre.Graph databases appear naturally in various scenarios, such as social networks and the semantic Web. In these cases, the information contained in the database lies as much in the data itself as in the topology of the graph, that is, in how the data points are linked together. This leads to considering traditional database theory questions for query languages that return data nodes based on the paths of the graph connecting them. We focus our attention on the view-based query determinacy and rewriting problems. They ask the question whether a view of the database contains enough information to fully answer a query without accessing the database directly. If so, we then want to express the answer to the query directly with regards to the view. This setting occurs in many applications, such as data integration and query optimization. We start by comparing these two tasks to other common task in this setting: computing certain answers, checking consistency of a view instance and updating it. We then build on these results in two specific cases. First, we show that for regular path queries, the existence of a monotone rewriting coincides with the existence of a rewriting expressible in Datalog. Then, we show that for views that only consider the lengths of the path in the graph, we can decide a weaker form of determinacy, called asymptotic determinacy, and produce first-order rewritings for the queries that are asymptotically determined

Theses.fr